AITopics | mean and unit variance

Collaborating Authors

mean and unit variance

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Self-Normalizing Neural Networks

Neural Information Processing SystemsMar-17-2026, 14:36:30 GMT

Deep Learning has revolutionized vision via convolutional neural networks (CNNs) and natural language processing via recurrent neural networks (RNNs). However, success stories of Deep Learning with standard feed-forward neural networks (FNNs) are rare. FNNs that perform well are typically shallow and, therefore cannot exploit many levels of abstract representations. We introduce self-normalizing neural networks (SNNs) to enable high-level abstract representations. While batch normalization requires explicit normalization, neuron activations of SNNs automatically converge towards zero mean and unit variance. The activation function of SNNs are scaled exponential linear units (SELUs), which induce self-normalizing properties.

artificial intelligence, machine learning, normalization, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

EstimatingRank-OneSpikesfrom Heavy-Tailed NoiseviaSelf-AvoidingWalks

Neural Information Processing SystemsFeb-8-2026, 03:55:57 GMT

The main question about the spiked Wigner model is: how large should the signal-to-noise ratio λ > 0 be in order to achieve constant correlation withx?

algorithm, artificial intelligence, vertex, (15 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)

Technology: Information Technology > Artificial Intelligence (0.67)

Add feedback

Self-Normalizing Neural Networks

Neural Information Processing SystemsNov-21-2025, 15:21:24 GMT

name change, normalization, self-normalizing neural network, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Self-Normalizing Neural Networks

Günter Klambauer, Thomas Unterthiner, Andreas Mayr, Sepp Hochreiter

Neural Information Processing SystemsNov-21-2025, 09:33:10 GMT

We introduce self-normalizing neural networks (SNNs) to enable high-level abstract representations.

artificial intelligence, machine learning, variance, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Austria > Upper Austria > Linz (0.04)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Self-Normalizing Neural Networks

Günter Klambauer, Thomas Unterthiner, Andreas Mayr, Sepp Hochreiter

Neural Information Processing SystemsOct-3-2024, 11:28:26 GMT

Deep Learning has revolutionized vision via convolutional neural networks (CNNs) and natural language processing via recurrent neural networks (RNNs). However, success stories of Deep Learning with standard feed-forward neural networks (FNNs) are rare. FNNs that perform well are typically shallow and, therefore cannot exploit many levels of abstract representations. We introduce self-normalizing neural networks (SNNs) to enable high-level abstract representations. While batch normalization requires explicit normalization, neuron activations of SNNs automatically converge towards zero mean and unit variance. The activation function of SNNs are "scaled exponential linear units" (SELUs), which induce self-normalizing properties.

activation, neural network, variance, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Austria > Upper Austria > Linz (0.04)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

FedScalar: A Communication efficient Federated Learning

Rostami, M., Kia, S. S.

arXiv.org Artificial IntelligenceOct-3-2024

Federated learning (FL) has gained considerable popularity for distributed machine learning due to its ability to preserve the privacy of participating agents by eliminating the need for data aggregation. Nevertheless, communication costs between agents and the central server in FL are substantial in large-scale problems and remain a limiting factor for this algorithm. This paper introduces an innovative algorithm, called \emph{FedScalar}, within the federated learning framework aimed at improving communication efficiency. Unlike traditional FL methods that require agents to send high-dimensional vectors to the server, \emph{FedScalar} enables agents to communicate updates using a single scalar. Each agent encodes its updated model parameters into a scalar through the inner product between its local update difference and a random vector, which is then transmitted to the server. The server decodes this information by projecting the averaged scalar values onto the random vector. Our method thereby significantly reduces communication overhead. Technically, we demonstrate that the proposed algorithm achieves a convergence rate of $O(1/\sqrt{K})$ to a stationary point for smooth, non-convex loss functions. Additionally, our analysis shows that altering the underlying distribution of the random vector generated by the server can reduce the variance during the aggregation step of the algorithm. Finally, we validate the performance and communication efficiency of our algorithm with numerical simulations.

algorithm, server, variance, (14 more...)

arXiv.org Artificial Intelligence

2410.0226

Country:

North America > United States > California > Orange County > Irvine (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

Estimating time-varying input signals and ion channel states from a single voltage trace of a neuron

Neural Information Processing SystemsMar-15-2024, 14:54:56 GMT

State-of-the-art statistical methods in neuroscience have enabled us to fit mathematical models to experimental data and subsequently to infer the dynamics of hidden parameters underlying the observable phenomena. Here, we develop a Bayesian method for inferring the time-varying mean and variance of the synaptic input, along with the dynamics of each ion channel from a single voltage trace of a neuron. An estimation problem may be formulated on the basis of the state-space model with prior distributions that penalize large fluctuations in these parameters. After optimizing the hyperparameters by maximizing the marginal likelihood, the state-space model provides the time-varying parameters of the input signals and the ion channel states. The proposed method is tested not only on the simulated data from the Hodgkin Huxley type models but also on experimental data obtained from a cortical slice in vitro.

input signal, ion channel state, synaptic input, (13 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)
North America > United States > New York (0.04)
Europe > Czechia > Prague (0.04)
Asia > Japan > Honshū > Kantō > Saitama Prefecture > Saitama (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Add feedback

Batch Norm Explained Visually -- How it works, and why neural networks need it

#artificialintelligenceApr-11-2023, 09:50:58 GMT

Batch Norm is an essential part of the toolkit of the modern deep learning practitioner. Soon after it was introduced in the Batch Normalization paper, it was recognized as being transformational in creating deeper neural networks that could be trained faster. Batch Norm is a neural network layer that is now commonly used in many architectures. It often gets added as part of a Linear or Convolutional block and helps to stabilize the network during training. In this article, we will explore what Batch Norm is, why we need it and how it works.

batch norm, mean and variance, variance, (12 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.36)

Add feedback

First Steps Toward Understanding the Extrapolation of Nonlinear Models to Unseen Domains

Dong, Kefan, Ma, Tengyu

arXiv.org Artificial IntelligenceDec-1-2022

Real-world machine learning applications often involve deploying neural networks to domains that are not seen in the training time. Hence, we need to understand the extrapolation of nonlinear models -- under what conditions on the distributions and function class, models can be guaranteed to extrapolate to new test distributions. The question is very challenging because even two-layer neural networks cannot be guaranteed to extrapolate outside the support of the training distribution without further assumptions on the domain shift. This paper makes some initial steps toward analyzing the extrapolation of nonlinear models for structured domain shift. We primarily consider settings where the marginal distribution of each coordinate of the data (or subset of coordinates) does not shift significantly across the training and test distributions, but the joint distribution may have a much bigger shift. We prove that the family of nonlinear models of the form $f(x)=\sum f_i(x_i)$, where $f_i$ is an arbitrary function on the subset of features $x_i$, can extrapolate to unseen distributions, if the covariance of the features is well-conditioned. To the best of our knowledge, this is the first result that goes beyond linear models and the bounded density ratio assumption, even though the assumptions on the distribution shift and function class are stylized.

artificial intelligence, machine learning, theorem 3, (16 more...)

arXiv.org Artificial Intelligence

2211.11719

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Washington > King County > Seattle (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Lecture Notes in Deep Learning: Activations, Convolutions, and Pooling -- Part 2

#artificialintelligenceJun-28-2020, 23:57:30 GMT

These are the lecture notes for FAU's YouTube Lecture "Deep Learning". This is a full transcript of the lecture video & matching slides. We hope, you enjoy this as much as the videos. Of course, this transcript was created with deep learning techniques largely automatically and only minor manual modifications were performed. If you spot mistakes, please let us know!

artificial intelligence, deep learning, machine learning, (16 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback